Skip to content

[ENH] V1 -> V2 Migration : Runs#1616

Open
Omswastik-11 wants to merge 235 commits intoopenml:mainfrom
Omswastik-11:runs-migration-stacked
Open

[ENH] V1 -> V2 Migration : Runs#1616
Omswastik-11 wants to merge 235 commits intoopenml:mainfrom
Omswastik-11:runs-migration-stacked

Conversation

@Omswastik-11
Copy link
Contributor

@Omswastik-11 Omswastik-11 commented Jan 15, 2026

Metadata

  • Reference Issue:
  • New Tests Added:
  • Documentation Updated:
  • Change Log Entry:

Details

fixes #1624

Copy link
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please sync with base PR and update with these comments #1576 (comment)

Copilot AI review requested due to automatic review settings February 25, 2026 15:42
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 52 out of 53 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


configurable_fields = [f for f in config._defaults if f not in ["max_retries"]]
configurable_fields = [
f.name for f in fields(openml._config.OpenMLConfig) if f.name not in ["max_retries"]
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code filters out a field named 'max_retries', but the OpenMLConfig dataclass does not contain a field with this name. The actual field is connection_n_retries. This filter condition should be updated to match the actual field name or removed if no longer needed.

Suggested change
f.name for f in fields(openml._config.OpenMLConfig) if f.name not in ["max_retries"]
f.name
for f in fields(openml._config.OpenMLConfig)
if f.name not in ["connection_n_retries"]

Copilot uses AI. Check for mistakes.
# Example script which will appear in the upcoming OpenML-Python paper
# This test ensures that the example will keep running!
with overwrite_config_context(
with openml.config.overwrite_config_context( # noqa: F823
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The # noqa: F823 comment appears incorrect for this context. F823 is a Flake8 error code for "local variable referenced before assignment", which doesn't apply here. This line should likely use # noqa: F401 (unused import) if anything, but since openml is used on line 12, it's likely not needed at all and should be removed.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 27, 2026 08:18
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 56 out of 57 changed files in this pull request and generated 10 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 114 to +120
def test_switch_to_example_configuration(self):
"""Verifies the test configuration is loaded properly."""
# Below is the default test key which would be used anyway, but just for clarity:
openml.config.apikey = "any-api-key"
openml.config.server = self.production_server
openml.config.set_servers("production")

openml.config.start_using_configuration_for_example()

assert openml.config.apikey == TestBase.user_key
assert openml.config.server == self.test_server
openml.config.servers = openml.config.get_servers("test")
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't verify that start_using_configuration_for_example() actually switched to the test server configuration. It only sets openml.config.servers to test servers after the switch, which doesn't test the intended behavior. Add assertions to verify that the configuration was correctly switched to test servers.

Copilot uses AI. Check for mistakes.
Comment on lines 123 to +130
def test_switch_from_example_configuration(self):
"""Verifies the previous configuration is loaded after stopping."""
# Below is the default test key which would be used anyway, but just for clarity:
openml.config.apikey = TestBase.user_key
openml.config.server = self.production_server
openml.config.set_servers("production")

openml.config.start_using_configuration_for_example()
openml.config.stop_using_configuration_for_example()

assert openml.config.apikey == TestBase.user_key
assert openml.config.server == self.production_server
openml.config.servers = openml.config.get_servers("production")
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't verify that stop_using_configuration_for_example() actually restored the production server configuration. It only sets openml.config.servers to production servers after the stop, which doesn't test the intended behavior. Add assertions to verify that the configuration was correctly restored.

Copilot uses AI. Check for mistakes.
assert "flow_id" in runs_df.columns


def test_run_v1_publish_mocked(run_v1, use_api_v1, test_api_key):
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture name test_api_key is not defined in conftest.py. The available fixture is named test_apikey_v1. Either rename this parameter to test_apikey_v1 or create a fixture named test_api_key.

Copilot uses AI. Check for mistakes.

def _mocked_perform_api_call(call, request_method):
url = openml.config.server + "/" + call
url = openml.config.server + call
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an extra space before the + operator on this line. While this doesn't affect functionality, it's inconsistent with the surrounding code style. Consider removing the extra space.

Copilot uses AI. Check for mistakes.
assert "flow_id" in runs_df.columns


def test_run_v1_publish_mocked(run_v1, use_api_v1, test_api_key):
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture name use_api_v1 is not defined anywhere in the codebase. This fixture is referenced but never created, which will cause the test to fail.

Copilot uses AI. Check for mistakes.
)


def test_run_v1_delete_mocked(run_v1, use_api_v1, test_api_key):
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture name test_api_key is not defined in conftest.py. The available fixture is named test_apikey_v1. Either rename this parameter to test_apikey_v1 or create a fixture named test_api_key.

Copilot uses AI. Check for mistakes.
)


def test_run_v1_delete_mocked(run_v1, use_api_v1, test_api_key):
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture name use_api_v1 is not defined anywhere in the codebase. This fixture is referenced but never created, which will cause the test to fail.

Copilot uses AI. Check for mistakes.
"openml_logger",
"_examples",
"OPENML_CACHE_DIR_ENV_VAR",
"OPENML_SKIP_PARQUET_ENV_VAR",
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attribute OPENML_TEST_SERVER_ADMIN_KEY_ENV_VAR is set in __init__ but is not included in the whitelist in __setattr__. This could cause issues if code tries to set this attribute after initialization. Add it to the whitelist on line 166-177.

Suggested change
"OPENML_SKIP_PARQUET_ENV_VAR",
"OPENML_SKIP_PARQUET_ENV_VAR",
"OPENML_TEST_SERVER_ADMIN_KEY_ENV_VAR",

Copilot uses AI. Check for mistakes.
"_examples",
"OPENML_CACHE_DIR_ENV_VAR",
"OPENML_SKIP_PARQUET_ENV_VAR",
"_HEADERS",
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attribute _defaults is set in __init__ but is not included in the whitelist in __setattr__. This could cause issues if code tries to set this attribute after initialization. Add it to the whitelist on line 166-177.

Suggested change
"_HEADERS",
"_HEADERS",
"_defaults",

Copilot uses AI. Check for mistakes.
Comment on lines +413 to +422
self._config = replace(
self._config,
servers=config["servers"],
api_version=config["api_version"],
fallback_api_version=config["fallback_api_version"],
show_progress=config["show_progress"],
avoid_duplicate_runs=config["avoid_duplicate_runs"],
retry_policy=config["retry_policy"],
connection_n_retries=int(config["connection_n_retries"]),
)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _setup method expects the config dictionary to contain fields like "servers", "api_version", and "fallback_api_version", but _parse_config returns the raw config parser output which may not contain these fields unless they are explicitly set in the config file. This will cause a KeyError when trying to access these fields. Either add default handling for missing fields or ensure _parse_config returns all required fields with defaults.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 27, 2026 16:11
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 56 out of 57 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +413 to +417
self._config = replace(
self._config,
servers=config["servers"],
api_version=config["api_version"],
fallback_api_version=config["fallback_api_version"],
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenMLConfigManager._setup() assumes values returned by _parse_config() are already typed (e.g. servers as a dict and api_version as APIVersion). However, configparser.RawConfigParser(...).items() returns strings, so servers, api_version, and fallback_api_version will be stringified defaults when the config file is missing/legacy, breaking later indexing like self.servers[self.api_version]. Consider keeping these fields out of the config file (or explicitly parsing them) and only updating them when they’re provided as proper objects (e.g. from in-memory overrides/tests).

Suggested change
self._config = replace(
self._config,
servers=config["servers"],
api_version=config["api_version"],
fallback_api_version=config["fallback_api_version"],
# Determine servers configuration, ensuring it remains a properly typed dict.
servers = self._config.servers
if "servers" in config and isinstance(config["servers"], dict):
servers = config["servers"]
# Determine API version, accepting either an APIVersion instance or a convertible string.
api_version = self._config.api_version
if "api_version" in config:
raw_api_version = config["api_version"]
if isinstance(raw_api_version, APIVersion):
api_version = raw_api_version
elif isinstance(raw_api_version, str):
try:
api_version = APIVersion(raw_api_version)
except ValueError:
self.openml_logger.warning(
"Invalid api_version '%s' in configuration; using default '%s'.",
raw_api_version,
self._config.api_version.value,
)
# Determine fallback API version, allowing None, APIVersion, or a convertible string.
fallback_api_version = self._config.fallback_api_version
if "fallback_api_version" in config:
raw_fallback = config["fallback_api_version"]
if raw_fallback is None or raw_fallback == "":
fallback_api_version = None
elif isinstance(raw_fallback, APIVersion):
fallback_api_version = raw_fallback
elif isinstance(raw_fallback, str):
try:
fallback_api_version = APIVersion(raw_fallback)
except ValueError:
self.openml_logger.warning(
"Invalid fallback_api_version '%s' in configuration; using default '%s'.",
raw_fallback,
(
self._config.fallback_api_version.value
if self._config.fallback_api_version is not None
else None
),
)
self._config = replace(
self._config,
servers=servers,
api_version=api_version,
fallback_api_version=fallback_api_version,

Copilot uses AI. Check for mistakes.
OpenMLHashException
If checksum verification fails.
"""
url = urljoin(self.server, path)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urljoin(self.server, path) will drop the last path segment if self.server does not end with / (e.g. base .../api/v1/xml + task/1 becomes .../api/v1/task/1). Since openml.config.server can be set by users without a trailing slash (and legacy configs likely do), normalize the base URL (ensure trailing slash) before calling urljoin, or avoid urljoin for simple concatenation.

Suggested change
url = urljoin(self.server, path)
url = f"{self.server.rstrip('/')}/{path.lstrip('/')}"

Copilot uses AI. Check for mistakes.
if TYPE_CHECKING:
from ._config import OpenMLConfigManager

config: OpenMLConfigManager = _config_module.__config
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openml.config is now an attribute (config manager instance), but the openml/config.py module was removed. This breaks existing user code that does import openml.config or patches via that module path. Consider restoring a lightweight openml/config.py shim (re-exporting from openml._config) or registering an alias in sys.modules to preserve the import path for backward compatibility.

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +92
Parameters
----------
config : Config
Configuration object containing API versions, endpoints, cache
settings, and connection parameters.
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build() docstring documents a config : Config parameter, but the method signature is (api_version, fallback_api_version) and there is no config argument. Update the docstring to match the actual parameters to avoid misleading API documentation.

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +38
def sample_download_url_v1(test_server_v1) -> str:
server = test_server_v1.split("api/")[0]
endpoint = "data/v1/download/1/anneal.arff"
url = server + endpoint
return url
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sample_download_url_v1 builds a download URL using data/v1/download/..., but the rest of the codebase (e.g. openml._api_calls._file_id_to_url) uses the unversioned /data/download/<id> pattern. If the test server doesn’t expose data/v1/download, these tests will 404. Consider generating the URL via the same helper used in production code (or align the hardcoded path with the actual download endpoint).

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +34
@pytest.fixture
def dummy_task_v2(http_client_v2, minio_client) -> DummyTaskV1API:
return DummyTaskV2API(http=http_client_v2, minio=minio_client)


@pytest.fixture
def dummy_task_fallback(dummy_task_v1, dummy_task_v2) -> DummyTaskV1API:
return FallbackProxy(dummy_task_v2, dummy_task_v1)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture return type annotations don’t match what’s returned: dummy_task_v2 returns DummyTaskV2API (not DummyTaskV1API), and dummy_task_fallback returns a FallbackProxy. Even if mypy ignores tests, keeping the annotations accurate helps IDEs/readability.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ENH] V1 → V2 API Migration - runs

8 participants